Control of systematic uncertainties in the use of Type Ia supernovae as standardized distance 
indicators can be achieved through contrasting subsets of observationally-characterized, like super- 
novae. Essentially, like supernovae at different redshifts reveal the cosmology, and differing super- 
novae at the same redshift reveal systematics, including evolution not already corrected for by the 
standardization. Here we examine the strategy for use of empirically defined subsets to minimize 
the cosmological parameter risk, the quadratic sum of the parameter uncertainty and systematic 
bias. We investigate the optimal recognition of subsets within the sample and discuss some issues 
of observational requirements on accurately measuring subset properties. Neglecting like vs. like 
comparison (i.e. creating only a single Hubble diagram) can cause cosmological constraints on dark 
energy to be biased by 1σ or degraded by a factor 1.6 for a total drift of 0.02 mag. Recognition of 
subsets at the 0.016 mag level (relative differences) erases bias and reduces the degradation to 2%. 



I. INTRODUCTION 

Distance-redshift measurements of Type Ia supernovae 
(SN) provide a direct mapping of the cosmic expansion history. 
The peak brightness of most SN has tighter dispersion 
than that of any other cosmological object, and it can be 
standardized with a simple light curve amplitude-width 
relation, first established by [1] in the early 1990s. This 
allows a SN to be calibrated to 0.15 mag or about 7% 
in distance, and provided the technique to discover the 
accelerated cosmic expansion in the late 1990s [1, Q. See 
[3] for a review as of 2001. For revealing the nature of the 
physics causing the acceleration, generically called dark 
energy, SN have continued to play a central role. 

The limits to the standardization for SN are not 
known; a second parameter to further reduce the intrinsic 
dispersion is actively sought among the SN observables, 
and more detailed measurements 
in spectroscopy and a wider range of wavelength bands 
may turn up new observables and correlations. The uncertainty 
on cosmological parameters improves as the intrinsic 
scatter decreases, both more rapidly than linearly 
as the reduced dispersion further improves color and dust 
corrections, and less rapidly as measurement uncertain- 
ties remain. 

Reduction in scatter can also be achieved by charac- 
terizing each supernova with a detailed array of measure- 
ments, expecting that supernovae with identical observed 
properties must also have identical intrinsic luminosities. 
These empirical observations can define subsets of SN. 
Note that the converse does not necessarily hold - SN 
that differ in some property, e.g. position in the host 
galaxy or its metallicity, may not diverge in luminosity 
(see [15] for one recent study). This was referred to as the 
mapping of subsets (empirical differences) to subclasses 
(intrinsic luminosity differences) in [16]. 

Mere differences in luminosities are not sufficient to 
affect cosmological parameter estimation, since they will 
be absorbed into the "nuisance" fit parameter for the intrinsic 
luminosity (which will be impacted). A further 
ingredient must be present: population drift, or evolution 
of the relative fraction of each subclass with redshift. 
Note that SN are not per se aware of the Hubble expansion: 
the explosions and radiation transport take place 
on scales many orders of magnitude smaller than the Hubble length. So 
SN should not evolve in a cosmic sense; rather they may 
be affected by their immediate environment and progenitor 
conditions. Since the full diversity of environments 
from higher redshifts also exists at low redshifts (e.g. stars 
and galaxies continue to form today) and only the proportion 
of different environments changes, population drift 
is a more accurate description of possible changes in SN 
luminosity. 

It is important to note here that while subsets of SN 
have been recognized, and the proportion of some sub- 
sets has been seen to change with redshift, current data 
show no definite indication that SN luminosity evolves - 
other than is automatically corrected for in using a stan- 
dard single parameter light curve amplitude-width rela- 
tion. That is, we know of no subsets that are subclasses. 

This article looks to the future when suites of obser- 
vations on large samples of SN, more detailed measure- 
ments than we have on any individual SN today, may 
show that indeed some subset, defined through those ob- 
servational characteristics, is a subclass having a differ- 
ent luminosity. The basic method of comparing subsets 
of like SN - likes vs. likes, or SN demographics - was explained 
clearly in [13], and we follow this approach while 
extending it to calculating detailed effects on cosmologi- 
cal parameter determination. 

Systematics is emphatically the name of the game in 
accurate science. Understanding the level of control is essential: 
without an intrinsic floor, SN are only limited by 
cosmic variance (from the number of SN within a Hubble 
volume) to 0.003% in distance precision. And of course 
a biased answer can be worse than an imprecise one. 

For the reader wanting a quick conclusion, see Fig. 3. 
In §II we establish the formalism of subclass luminosity 
functions and calculate the effects of population drift on 
the mean and variance of the full sample luminosity function. 
Using this in §III we identify three distinct impacts 
on cosmology determination, and show that the bias is 
a major effect. In §IV we examine the interplay of bias 
and uncertainty as we investigate strategies for controlling 
systematics, such as adding fit parameters for observationally 
recognized subclasses. We address aspects of 
the observational requirements for identifying subclasses 
in §V and conclude in §VI. 

II. SUBCLASSES, POPULATION DRIFT, AND 
MAGNITUDE EVOLUTION 

We begin by considering an observed sample of SN to 
be composed of a set of subsamples that we may or may 
not distinguish. The intrinsic luminosity, or magnitude, 
distribution of the overall SN population at some red- 
shift is a sum over all the individual subset luminosity 
functions. That is 

$$\Phi(L,z) = \delta\Big(\bar{L} - \sum_i f_i(z)\,L_i\Big)\,\sum_i f_i(z)\,\phi_i(L), \qquad (1)$$

where $\phi_i$ is an individual subset luminosity function, $L_i$ 
the mean luminosity of that subset, and $f_i(z)$ the fraction 
of the total population sample that subset represents at 
redshift $z$. 

Note again that $L_i$ is the mean luminosity: we are not 
imposing that the subset is a subclass with standard luminosity^, 
only that the mean luminosity is independent 
of redshift. This still places a strong burden on excellence 
of observations and requires something in between an empirically 
defined subset (since we need some knowledge of 
the luminosity behavior, i.e. the subset of SN discovered 
on a Tuesday is insufficient) and a subclass. We discuss 
this challenge further, and how to handle deviations, in 
§V. For now we continue to call it a subset, and effectively 
each subset's $L_i$ represents a different absolute 
magnitude $M_i$. 

Given Eq. (1) for the probability distribution function 
we can calculate whichever moments of the total luminosity 
distribution are desired, in terms of moments of the 
individual subset luminosities, without requiring any assumption 
of, say, a Gaussian form. The drift of the mean 
luminosity of the sample, relative to the value at some 
redshift $z_0$, is 

$$\langle L(z) - L(z_0)\rangle = \sum_i \delta L_i(z_*)\,[f_i(z) - f_i(z_0)], \qquad (2)$$

where we define the offset of each subset mean luminosity 
as 

$$\delta L_i(z_*) \equiv L_i - \bar{L}(z_*). \qquad (3)$$



^ Of course more tightly defined subsets should have smaller dispersion 
about the mean, and poorly characterized subsets, or 
those defined through variables unrelated to the luminosity, may 
have larger dispersion such that the difference between subsets' 
luminosities is smeared out, but this does not affect the formalism. 
See §V where we return to discussion of these points. 



We are free to evaluate the subset offset relative to some 
other redshift $z_*$, though generally we will take $z_0 = z_* = 0$. 

The result for the variance of the total sample luminosity 
is 

$$\sigma_L^2(z) \equiv \langle L^2(z)\rangle - \langle L(z)\rangle^2 \qquad (4)$$
$$\quad = \sum_i f_i(z)\,\sigma_i^2 + \sum_i f_i(z)\,\delta L_i^2(z_*) - \Big[\sum_i f_i(z)\,\delta L_i(z_*)\Big]^2. \qquad (5)$$

The first term is a subset-weighted dispersion, where $\sigma_i^2$ 
is the luminosity variance of subset $i$, and the final two 
terms are contributions from the offset of the mean subset 
luminosity relative to the mean sample luminosity. If the 
offsets $\delta L_i$ are zero (if they are equal, they must be zero 
by the delta function in Eq. (1)), then these bias terms 
vanish. 
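
As a hedged numerical check of the variance decomposition, the sketch below compares the subset-based expression (subset-weighted variances plus the two offset terms) with the variance computed directly from the mixture moments. The function names and the three-subset numbers are illustrative assumptions, not values from the text. The check also shows that the result does not depend on the reference luminosity used to define the offsets, consistent with the freedom to choose the reference redshift.

```python
import numpy as np

def variance_from_offsets(f, L, sigma, L_ref):
    """Mixture variance: weighted subset variances plus the two offset terms."""
    f, L, sigma = (np.asarray(v, dtype=float) for v in (f, L, sigma))
    dL = L - L_ref  # offsets of the subset means from a reference luminosity
    return np.sum(f * sigma**2) + np.sum(f * dL**2) - np.sum(f * dL) ** 2

def variance_direct(f, L, sigma):
    """Mixture variance computed directly as <L^2> - <L>^2."""
    f, L, sigma = (np.asarray(v, dtype=float) for v in (f, L, sigma))
    mean = np.sum(f * L)
    second_moment = np.sum(f * (sigma**2 + L**2))
    return second_moment - mean**2

# Hypothetical three-subset sample (arbitrary luminosity units).
f = [0.5, 0.3, 0.2]
L = [1.00, 1.02, 0.97]
sigma = [0.10, 0.12, 0.15]

v_direct = variance_direct(f, L, sigma)
v_ref0 = variance_from_offsets(f, L, sigma, L_ref=1.0)
v_ref1 = variance_from_offsets(f, L, sigma, L_ref=0.9)
print(v_direct, v_ref0, v_ref1)
```

The three printed values should agree to rounding error, provided the fractions sum to one.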



III. EFFECTS OF THE MAGNITUDE 
DISTRIBUTION ON SUPERNOVA COSMOLOGY 



Recognizing subclasses of SN can have three effects on 
the calculation of cosmological parameter uncertainties: 
it might 1) reduce the dispersion of the sample used in 
the Hubble diagram, 2) reduce the residual systematic 
error, 3) reduce cosmology parameter bias if analyzed in 
the proper way. 

For the first effect, let us first consider the influence 
of the offset terms in Eq. (5). The following argument 
indicates they likely do not have a substantial impact. 
Consider two subsets, offset in absolute magnitude by 
$\delta m_{12}$. (For the remainder of the article we phrase the 
analysis in terms of magnitudes rather than luminosities; 
for small differences in subset luminosities one can make 
a direct substitution in the formulas.) Then 

$$\sigma^2 = \sigma_2^2 + f_1(\sigma_1^2 - \sigma_2^2) + \delta m_{12}^2\,f_1(1 - f_1), \qquad (6)$$

where $f_1$ is the fraction of the population in subset 1 (and 
$1 - f_1$ is in subset 2). Since the maximum of $f_1(1 - f_1)$ is 
1/4, and the magnitude offset should be (much) less than 
the dispersion, the last term is unlikely to be important. 
For $\sigma_1 \approx \sigma_2$, we simply have that the dispersion of the 
sample is nearly the dispersion of the subsets. 
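
To give a feel for the size of the offset term, here is a minimal sketch of the two-subset dispersion formula. The inputs (equal 0.15 mag dispersions, a 0.02 mag offset, an even split) are illustrative choices drawn from the scales quoted elsewhere in the article, and the function name is hypothetical.

```python
import math

def two_subset_sigma(f1, sigma1, sigma2, dm12):
    """Sample dispersion for two subsets whose means are offset by dm12."""
    var = sigma2**2 + f1 * (sigma1**2 - sigma2**2) + dm12**2 * f1 * (1.0 - f1)
    return math.sqrt(var)

# Even split (f1 = 1/2) maximizes the offset term f1 * (1 - f1).
sigma_total = two_subset_sigma(0.5, 0.15, 0.15, 0.02)
print(round(sigma_total, 5))
```

For these inputs the sample dispersion exceeds the subset dispersion by well under one percent, in line with the argument that the offset term is unlikely to matter.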

Next within effect 1 we consider when the dispersion in 
a subsample is reduced. While this has a mild effect on 
the variance of the full sample, we can imagine a Hubble 
diagram formed only from the subsample (as suggested 
for example for elliptical galaxy hosted SN, though this 
reduces the external systematic of dust extinction not 
the internal luminosity variation). This will have fewer 
data points, decreasing the cosmological leverage in opposition 
to the lesser dispersion. In the statistical error 
regime, the error is effectively $\sigma_i/\sqrt{N_i} \propto \sigma_i/f_i^{1/2}$, so the 
subset must account for a sizeable fraction of the population 
over a wide range of redshifts in order for this 
subdiagram to improve in precision over the full Hubble 
diagram. For example, if $\sigma_{\rm full} = 0.15$ and $\sigma_1 = 0.1$, 
one requires $f_1 > 0.44$. Data from ongoing large surveys, 
such as the Supernova Legacy Survey (SNLS [11]), 
the Nearby Supernova Factory (SNf), and the CfA 
Supernova Archive [18], that characterize SN properties 
in detail may lead to such reduced-dispersion subsets. 
However, future surveys are likely to be fundamentally 
limited by systematics. In the systematic error regime, 
while one has supernovae to spare and could use only 
those from the reduced-dispersion population, the sys- 
tematics dominate over the statistical dispersion and the 
subset Hubble diagram does not help, unless effect 2 en- 
ters, reducing systematics. 
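
The threshold fraction quoted above follows from requiring the subset error, $\sigma_1/\sqrt{f_1 N}$, to be no larger than the full-sample error, $\sigma_{\rm full}/\sqrt{N}$. A minimal sketch of that arithmetic, valid only in the statistical error regime and assuming a subset fraction that is roughly constant with redshift (the function name is hypothetical):

```python
def min_subset_fraction(sigma_subset, sigma_full):
    """Smallest fraction f with sigma_subset / sqrt(f * N) <= sigma_full / sqrt(N)."""
    return (sigma_subset / sigma_full) ** 2

f_min = min_subset_fraction(0.10, 0.15)
print(round(f_min, 2))
```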

As we obtain more incisive measurements of the super- 
nova sample, characterizing each SN in more detail, we 
can potentially reduce the systematic uncertainties. Re- 
call that systematics refers to the uncertainties remain- 
ing after correction procedures have been applied, so the 
more information about a SN, the better chance for cor- 
rections to work to a deeper level. In the systematics 
dominated regime, an improvement by a factor of two in 
systematics leads to a factor of two tighter constraints 
on cosmological parameters. This is a strong reason 
for gathering a large suite of measurements to recognize 
subsets. However, not all systematics in an experiment 
arise from source properties - instrumental errors such as 
from filters and calibration also enter. The effect of sub- 
set recognition on residual systematics requires detailed, 
experiment-specific simulations for quantitative answers. 
Thus, although such data hold considerable promise for 
improving supernova probes, in this paper we concen- 
trate on the third effect, cosmological parameter bias and 
degradation, treated in the next section. 



IV. MINIMIZING SYSTEMATICS IMPACT ON 
COSMOLOGY DETERMINATION 

Without recognition of subclasses, population drift 
among them will appear as evolution in the mean ab- 
solute magnitude of the sample. Again, the differential 
population demographics is key - mere constant differ- 
ences between subclasses are absorbed completely into 
the fit parameter for the absolute magnitude, M, and 
do not impact the cosmological parameters. The bias in- 
duced by the magnitude evolution on the cosmological 
parameters can be evaluated at the same time as the pa- 
rameter estimation uncertainties with standard Fisher, 
or information, matrix techniques. 

Specifically, the bias on parameter $p_i$ is 

$$\delta p_i = (F^{-1})_{ij} \sum_k \frac{\partial m_k}{\partial p_j}\,\frac{1}{\sigma_k^2}\,\Delta m_k, \qquad (7)$$

where $m_k$ is a supernova magnitude, $\Delta m_k$ the offset due 
to the effective evolution, and $F$ is the Fisher matrix 
over parameters $p$. For simplicity, we write this for a 
diagonal error matrix with entries $\sigma_k^2$; see [21] for the 
generalization. 
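
The Fisher bias formula can be sketched numerically. The example below is a toy under stated assumptions: the model $m = p_0 + p_1 z$ is not the cosmological magnitude-redshift relation, and the function and variable names are hypothetical. Here $p_0$ plays the role of the absolute magnitude nuisance parameter, so a constant offset should be absorbed into it, while a redshift-dependent drift biases the other parameter.

```python
import numpy as np

def fisher_bias(dm_dp, sigma, delta_m):
    """Return (bias, F) for a diagonal error matrix.

    dm_dp   : (n_data, n_par) derivatives of magnitude w.r.t. parameters
    sigma   : (n_data,) magnitude uncertainties
    delta_m : (n_data,) systematic magnitude offsets
    """
    dm_dp = np.asarray(dm_dp, dtype=float)
    weights = 1.0 / np.asarray(sigma, dtype=float) ** 2
    fisher = dm_dp.T @ (weights[:, None] * dm_dp)
    source = dm_dp.T @ (weights * np.asarray(delta_m, dtype=float))
    # Solving F x = source is equivalent to applying F^{-1} to the source term.
    return np.linalg.solve(fisher, source), fisher

# Toy model (an assumption for illustration): m = p0 + p1 * z.
z = np.linspace(0.1, 1.7, 17)
derivs = np.column_stack([np.ones_like(z), z])
sigma = np.full_like(z, 0.15)

bias_const, _ = fisher_bias(derivs, sigma, np.full_like(z, 0.02))
bias_drift, _ = fisher_bias(derivs, sigma, 0.02 * z / 1.7)
print(bias_const, bias_drift)
```

In this toy the constant 0.02 mag offset shifts only $p_0$, while the linear drift shifts only $p_1$, mirroring the statement that constant differences are absorbed and only drift biases the remaining parameters.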

As an example of the importance of accounting for 
bias, note that for two populations differing in absolute 
magnitude by 0.02, with one dominating at low redshift 
and the other at high redshift, the cosmological parame- 
ter bias can amount to greater than one statistical sigma. 

Only two ingredients are required for determining the 
impact on cosmology: the mean absolute magnitude of 
each population, $M_i$, and the population fractions or demographics 
$f_i(z)$, which combine together to form the 
offset $\Delta m$ through Eq. (2), 

$$\Delta m(z) = \sum_i \Delta M_i\,[f_i(z) - f_i(0)], \qquad (8)$$

where $\Delta M_i = -2.5\log \delta L_i(0)$. The first ingredient is 
of course not known from observations, while one could 
measure the populations $f_i$ from the data itself (we discuss 
this further in §V). We consider several different 
models for each and examine the range of cosmology impacts. 

To remove a bias induced by different Mi, one could 
introduce additional fit parameters for them (or equivalently 
for $\mathcal{M}_i = M_i - 5\log h$, where $h$ is the dimensionless 
Hubble constant). This of course only applies to those 
subsets that are recognized, i.e. empirically distinguished 
by the values of a certain set of measurements (for exam- 
ple, high line velocity, elliptical galaxy host, strong ultra- 
violet flux, etc.). The more subsets recognized, and fit 
parameters introduced, the less bias in the cosmology de- 
termination, but the more uncertainty in the estimation 
of the cosmology parameters due to the larger parameter 
space. 

Explicitly, if there are $N$ subsets and we have the observational 
acuity to recognize $R$ of them, then we can fit 
for $\mathcal{M}$ (representing the gross subsample of unrecognized 
subsets) and $\mathcal{M}_1, \ldots, \mathcal{M}_R$ and suffer a cosmology bias 

$$\Delta m(z) = \sum_{i=R+1}^{N} \Delta M_i\,[f_i(z) - f_i(0)] \qquad (9)$$

due to the $N - R$ unrecognized subsets. The question 
then is simply which wins out: improved precision from 
fitting for fewer parameters, or improved accuracy from 
reducing bias. That is, what is the optimum value for $R$ 
(given the properties $M_i$, $f_i(z)$ of the subsets)? 

To take into account both the dispersion and bias in 
parameter estimation, a standard statistical tool is the 
risk [ill , the square root of the quadratic sum of the two 
terms, i.e. 

$$\mathrm{Risk}(p) = \sqrt{\sigma_p^2 + \delta p^2}. \qquad (10)$$
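
A minimal sketch of the risk statistic, the square root of the quadratic sum of the parameter uncertainty and the bias. The function name is hypothetical; the example shows that a bias equal to one statistical sigma inflates the risk by a factor of $\sqrt{2}$ relative to the unbiased case.

```python
import math

def risk(sigma_p, bias_p):
    """Quadratic sum of the statistical uncertainty and the systematic bias."""
    return math.hypot(sigma_p, bias_p)

inflation = risk(1.0, 1.0) / risk(1.0, 0.0)
print(round(inflation, 3))
```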

We analyze the risk as a function of magnitude offsets, 
population model, and subsets recognized, seeking the 
optimal strategy for supernova cosmology - is it better 
to have a single Hubble diagram of all supernovae, which 
will have tight but biased parameter constraints, or to divide 
the sample into the maximum number of recognized 
subsets, giving looser but less biased cosmology determination? 

For the population model we adopt the form 

$$f_i(z) = f_i(0) + A_i\,(z/1.7)^{B_i}. \qquad (11)$$

(See §V D and the Appendix for generalizations.) This is 
subject to the constraint that the populations sum up to 
the total sample, $\sum_i f_i(z) = 1$ for all $z$, which is easiest 
to implement if $B_i = B$. We consider $B$ = 1/3, 1, 3 
to cover a range of behaviors. Figure 1 illustrates these 
population evolutions, giving respectively a high rate of 
change at low redshift, even weighting, or a high rate at 
high redshift. While one could consider scale factor or 
cosmic time as the independent variable, this is not qualitatively 
different from changing the value of $B$. Also, 
as a concrete example note that the population drift in 
the mean stretch parameter seen by [23] follows a linear 
redshift dependence ($B = 1$) from $z$ = 0.03-1.12. 

FIG. 1: The population drift model is designed to cover a 
range of behaviors from rapid evolution at low redshift ($B < 1$), 
to linear evolution ($B = 1$), to rapid evolution at high 
redshift ($B > 1$). The coefficient $A$ sets the amplitude of the 
drift. 

For the absolute magnitudes of the individual subsets 
we take them to differ from the mean absolute magnitude 
by $\pm X$, $\pm 2X$, considering four subsets. A constant 
shift $\bar{M}$ in the magnitudes is simply absorbed into the 
absolute magnitude nuisance parameter - if all subsets 
are accounted for. The mean absolute magnitude is relevant 
when only some subsets are recognized, since then 
the sum of those populations can be redshift dependent 
(i.e. not unity, or zero). Explicitly, 

$$\sum{}' (\Delta M_i + \bar{M})\,[f_i(z) - f_i(0)] = \Delta m(z) + \bar{M} \sum{}' [f_i(z) - f_i(0)], \qquad (12)$$

where a prime denotes the sum runs over unrecognized 
subsets. But the last term is zero only when $\sum{}' f_i(z) = 1$ 
for all $z$, i.e. all subsets are included in the sum (all 
unrecognized), or the sum is trivially zero (all subsets 
recognized). 

The cosmology bias will scale with the subset magnitude 
offsets so we can express the results as a function 
of the effective magnitude evolution in the full sample. 
That is, we can phrase the offset amplitude $X$ in terms 
of $\Delta m(z = 1.7)$, say. In the specific example treated 
below, we take $\{f_i(z = 0)\} = \{1/4, 1/4, 1/4, 1/4\}$ and 
$\{f_i(z = 1.7)\} = \{1/2, 3/8, 1/8, 0\}$, with the population 
drift rate determined by the value adopted for $B$, as in 
Eq. (11). In this case, $\Delta m(z) = (5X/4)(z/1.7)^B$. 
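
The population model and the resulting effective magnitude evolution can be sketched as follows. This is an illustration under stated assumptions: the ordering of the offsets as (+2X, +X, -X, -2X) is our choice, made because it reproduces the quoted $(5X/4)(z/1.7)^B$; the helper names and the optional `unrecognized` argument (which restricts the sum to unrecognized subsets, as in the bias due to unrecognized subsets) are hypothetical.

```python
import numpy as np

Z_MAX = 1.7
F0 = np.array([1 / 4, 1 / 4, 1 / 4, 1 / 4])    # fractions at z = 0
F_MAX = np.array([1 / 2, 3 / 8, 1 / 8, 0.0])   # fractions at z = 1.7

def fractions(z, f0, f_max, B):
    """Subset fractions f_i(z) = f_i(0) + A_i (z / 1.7)^B with a common B."""
    f0, f_max = np.asarray(f0, float), np.asarray(f_max, float)
    A = f_max - f0  # drift amplitudes; they sum to zero so sum_i f_i(z) = 1
    return f0 + A * (z / Z_MAX) ** B

def delta_m(z, dM, f0, f_max, B, unrecognized=None):
    """Effective magnitude evolution summed over (unrecognized) subsets."""
    dM = np.asarray(dM, float)
    drift = fractions(z, f0, f_max, B) - np.asarray(f0, float)
    idx = slice(None) if unrecognized is None else list(unrecognized)
    return float(np.sum(dM[idx] * drift[idx]))

X = 0.016  # mag; chosen so that 5X/4 = 0.02 mag at z = 1.7
dM = X * np.array([2.0, 1.0, -1.0, -2.0])  # assumed ordering of the offsets

full_drift = delta_m(1.7, dM, F0, F_MAX, B=1.0)
half_way = delta_m(0.85, dM, F0, F_MAX, B=1.0)
extremes_recognized = delta_m(1.7, dM, F0, F_MAX, B=1.0, unrecognized=[1, 2])
print(full_drift, half_way, extremes_recognized)
```

With no subsets recognized the drift at $z = 1.7$ is $5X/4$; if the two most extreme subsets are recognized, only the two middle subsets contribute and the residual drift falls to $X/4$ in this toy setup.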

To analyze the cosmological impact, we must take into 
account both the cosmology parameter estimation and 
the bias. To do this compactly, we adapt the "area figure 
of merit" to the full risk. Here, the dark energy equation 
of state is $w(a) = w_0 + w_a(1 - a)$, where $a = 1/(1 + z)$ 
is the cosmic expansion factor, and the area of some 
likelihood contour in the $w_0$-$w_a$ plane is taken as the 
area figure of merit. In practice, one equivalently quotes 
$1/[\sigma(w_a) \times \sigma(w_p)]$, where $w_p$ is the pivot value, the value 
of $w$ at the redshift where the uncertainties in $w_0$ and 
$w_a$ are uncorrelated. To incorporate parameter biases 
$\delta p$, we define the risk figure of merit^ from Eq. (10) as 
$1/[\mathrm{Risk}(w_a) \times \mathrm{Risk}(w_p)]$. 

Now we can quantify to what extent it is advantageous 
or not to rigorously define subsets through detailed observations. 
Figure 2 illustrates the effect on the dark energy 
parameter determination. This combines simulated high 
quality data from 2300 SN between $z$ = 0-1.7 with 
Planck CMB information to estimate the cosmological 
parameters. If we somehow knew that all subsets had the 
same mean absolute magnitude, i.e. that no magnitude 
evolution were possible, then the figure of merit is simply 
the usual area of merit and is shown by the horizontal line 
labeled "ideal". If we use only a single Hubble diagram, 
making no effort to, or failing to, recognize subsets, then 
the degradation in figure of merit is severe, shown by the 
solid, black curves. These show the best and worst cases 
of the values of $B$ used in the population drift. For a total 
effective evolution to $z = 1.7$ of 0.02 mag, the single 
Hubble diagram approach degrades the cosmology constraint 
by a factor of 1.3-1.6. No increase in the number 
of SN can fully make up for this degradation, assuming a 
systematic floor of $dm_{\rm sys} = 0.02(1 + z)/2.7$. Even worse, 
while larger numbers of SN will tighten the precision they 
will increase the relative bias on the cosmological parameters. 

^ In many circumstances this is a conservative estimate of the damage. 
One could define an area taking into account all possible 
shifts of the likelihood contour due to bias, as effectively adding 
to the uncertainty. This area increase is often larger than the 
effective area increase from the risk, but it is dependent on the 
$\Delta\chi^2$ level of the confidence contour considered, and so we stay 
with the well defined risk statistic. 



[Figure 2 plot: figure of merit curves versus drift $\Delta m(z=1.7)$, with a horizontal line labeled "ideal"] 

FIG. 2: Recognition of like SN subsets has significant impact 
on the dark energy figure of merit incorporating the trade off 
between precision and bias. Unrecognized population drift 
induces evolution in the SN magnitude, $\Delta m(z)$, and bias in 
the cosmological parameters, while adding a fit parameter for 
recognized subsets costs in precision. For a single (full sam- 
ple) Hubble diagram, the degradation in figure of merit due 
to bias can be substantial, as shown by lowest solid curve. 
For each case we plot the envelope of worst and best results 
scanning over the population evolution and permutation of 
subsets recognized. Maximizing the number of subsets rec- 
ognized is the optimum strategy except for very small drifts, 
and even then the cost is less than 2%. 

Recognizing 1 (dotted, red curves) or 2 (short dashed, 
magenta curves) of the 4 subsets acts to improve the sit- 
uation. (The case of 3 subsets recognized is equivalent to 
that of all recognized, since the remainder of the sample 
is simply the fourth subset.) Here the upper and lower 
curves represent the best and worst of not only variation 
over $B$, but also the permutations of which of the 
4 subsets are recognized. That is, identifying the subset 
with the most extreme magnitude offset is most useful, 
while one with an offset little different from the mean is 
of marginal effect. Indeed, if we recognize the two most 
extreme subsets, we approach the perfect situation, while 
finding the two least extreme ones only improves by 14% 
over the worst case of the single Hubble diagram (also 
see §V A). 



Finally, we consider sufficiently good observations to 
recognize all subsets (long dashed, blue curves). In this 
case we must fit for 4 different $\mathcal{M}_i$ parameters, and the key 
question is whether the elimination of bias is worth 
the loss in precision due to the expanded parameter set. 
The answer is emphatically yes - the figure of merit is 
only 1.7% below the ideal case. This represents up to 
a 55% improvement over using exactly the same SN in 
a single Hubble diagram (see Fig. 3 for a clear view of 
these essential points). Moreover, the answer obtained 
represents the true cosmology without a bias. Only when 
the evolution is extremely small, $\Delta m(z = 1.7) < 0.005$, 
which we do not know a priori, do we fail to gain by 
employing the likes vs. likes approach, where again the 
highest cost is less than 2%. 

[Figure 3 plot: horizontal axis is the drift $\Delta m(z=1.7)$] 

FIG. 3: Same as Fig. 2 but a simpler version containing only 
the extreme cases to bring out the essential result. 



V. OBSERVATIONAL REQUIREMENTS ON 
DEFINING SUBCLASSES 

While the optimum survey design would be to obtain 
a full suite of observations that enables recognition of all 
subclasses, this cannot always be realized. In this section 
we examine three cases of less than perfect observations 
and investigate the implications for cosmology determi- 
nation. Finally, while the issue of exactly how to define 
subclasses is complex and largely unknown, we discuss 
generically some possible routes toward this. 

The ability to recognize subsets depends on the acuity 
of the observations. This in turn depends on the instrumentation, 
exposure time, types of data collected, etc. 
While this is too complex an area to explore here, we 
can get an idea of the effect on cosmology through toy 
models in the next three subsections, exploring respec- 
tively the degree of difference between subsets, overlap 
and confusion, and a continuum of subclass properties. 



A. Separated Subclasses 

Given disjoint subsets, with absolute magnitudes off- 
set from the mean, one is most likely to recognize those 
subsets that are most discrepant. Since these also tend 
to induce the greatest cosmological parameter bias (depending 
on the evolution of the subset fraction $f_i(z)$), 
this can mean that recognizing merely the most extreme 
subsets helps substantially toward removing bias. 

For example, in the results of Fig. 2 we find that recognizing 
the two most discrepant subsets gives a 28-52% 
improvement over recognizing none (for a drift of 0.02 
mag out to z = 1.7), while recognizing the two least dis- 
crepant only improves by 8-14%. The absolute level from 
recognizing the two most discrepant subsets approaches 
2.5% below the ideal case. For recognizing a single sub- 
set, the improvements are 17-29% and 4-7% for the most 
and least discrepant, respectively, and recognizing the 
most discrepant subset brings the figure of merit within 
11% of the ideal case. 

To examine this further, we consider the effect of in- 
creasing the ability to resolve the subsets. For example, 
if we believe that some observable such as line velocity 
correlates with luminosity, then we need to have the ca- 
pability to make sufficiently accurate measurements of 
this variable. As a toy model we take subsets with absolute 
magnitudes distributed at $X$, $-(3/4)X$, $-X/2$, and 
$X/4$ relative to the mean at $z = 0$ (recall that the 
sum of the $\Delta M_i$'s is taken to be zero). We then consider 
experiments with varying ability to resolve discrepancies 
from the mean and ask at what degree of difference does 
the figure of merit degrade by a certain percent. 

As the resolution degrades past the smallest degree 
of difference of a subset from the mean, that subset is 
no longer recognized per se, but can still be identified 
as the "leftover" from all the other subsets. Once the 
next subset threshold is passed, however, then the cos- 
mology determination degrades, and so on as the reso- 
lution coarsens, until no subsets can be recognized. Fig- 
ure m shows the behavior for the case of four subsets as 
before, though with the absolute magnitude distribution 
as above. 

The resolution required for no more than 10% degradation 
in dark energy figure of merit is at the level of 0.019, 
0.028, 0.042 mag for total drifts $\Delta m(z = 1.7)$ = 0.01, 
0.02, 0.03 respectively. To limit the degradation to 20% 
one requires resolution of 0.037, 0.042 for $\Delta m(z = 1.7)$ = 
0.02, 0.03 (for only a 0.01 evolution in magnitude, the 
bias does not become large enough to reduce the figure of 
merit by 20%). In fact, these numbers are too optimistic 
in that one would need to see the subset deviate from 
the mean at 2-3$\sigma$ for robust recognition and suppression 
of normal outliers. In general, if the survey aims to be 
sensitive to an evolution at some level $\Delta m(z = 1.7)$ (with 
the associated cosmological bias and leverage), then the 
observational resolution should be designed to be somewhat 
finer, $\sigma(M) \approx (1/2)\,\Delta m(z = 1.7)$. 

FIG. 4: Finite resolution in measuring supernova characteristics 
translates into a finite resolution for distinguishing subsets 
from the mean sample luminosity. For three levels of magnitude 
evolution we plot the degradation in dark energy figure 
of merit as resolution coarsens and subsets become unrecognized. 
A rough rule of thumb is that an observational resolution 
less than the total evolution sensitivity to be probed, 
$\sigma(\Delta M) < \Delta m(z = 1.7)$, is needed. 

Of course when the experimental resolution weakens 
and makes discrimination of subsets from the mean diffi- 
cult, the uncertainty effectively broadens the subsets and 
could cause them to overlap. This is a situation distinct 
from straightforward recognition or not, and we discuss 
it next. 



B. Overlapping Subclasses 

So far we have discussed the subsets as either recog- 
nized or unrecognized, and assumed that the recognized 
subsets are distinct. Subsets, however, can have some luminosity 
function distribution, as mentioned in §II, and 
the recognition can be fuzzy. For example, two sub- 
sets may possess overlapping luminosity functions and a 
member of one subset might be misassigned to another. 
This will change the fraction in each subset away from 
the true value, inducing a cosmology bias 

$$\delta m(z) = \sum_{\rm recognized} \Delta M_i\,[\delta f_i(z) - \delta f_i(0)], \qquad (13)$$

where $\delta f_i$ is the misestimate of the population fraction. Note 
that here the sum runs over recognized subsets, unlike in 
Eq. (9); if the subsets are not recognized to begin with, 
then the mixing has no effect. Here $\Delta M_i$ represents the 
true (though unknown) mean absolute magnitude of each 
subset, which we take to be unaffected by the misassignment 
(we consider fuzziness in both subset population 
and absolute magnitude in §V C). 

We examine the consequences for a model with a fraction 
$f_{12}$ of the total sample, belonging to subset 1 but 
overlapping with subset 2, having a probability $P_{1\to 2}$ of 
these being misassigned. In general, the misestimation 
gives 



(14) 



Summing over all subsets (including the unrecognized 
ones) enforces that $\sum_i \delta f_i = 0$, i.e. a supernova lost from 
one subset shows up in another, or in the main undiffer- 
entiated group. 

First consider the case of two overlapping subsets, and 
two extremes. If the misassignment leads to a complete 
swap of one subset with another, so that $f_{ji} = f_j$ and 
$P_{j\to i} = 1$, then $\delta m(z) = X\,(A_2 - A_1)\,(z/1.7)^B$, where the 
absolute magnitudes of the two subsets differ by $X$. For 
the parameters of §IV this amounts to $\delta m(z = 1.7) = -0.00125$, 
which is insignificant. As the other extreme, if 
instead of a swap, a transfer occurs, i.e. a one-sided loss, 
with $P_{1\to 2} = 1$, $P_{2\to 1} = 0$, $f_{12} = f_1$, then 
$\delta m(z = 1.7) = -X A_1 = -0.0025$. 
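
The two extremes can be checked with a short sketch. Two assumptions are ours rather than the article's: the general misestimation is taken to be gains minus losses, $\delta f_i = \sum_j (T_{ji} f_j - T_{ij} f_i)$, where the hypothetical matrix $T_{ij}$ is the share of subset $i$ assigned to subset $j$; and the offset amplitude is set to $X = 0.01$ mag, the value that reproduces the quoted numbers with $A_1 = 1/4$ and $A_2 = 1/8$.

```python
import numpy as np

def overlap_bias(dM, A, T):
    """Bias at z = 1.7 from misassignment among recognized subsets.

    dM : subset magnitude offsets
    A  : drift amplitudes f_i(1.7) - f_i(0)
    T  : T[i, j] = share of subset i misassigned to subset j (assumed form)
    """
    dM, A, T = (np.asarray(v, dtype=float) for v in (dM, A, T))
    gained = T.T @ A           # sum_j T[j, i] * A_j
    lost = T.sum(axis=1) * A   # A_i * sum_j T[i, j]
    return float(np.sum(dM * (gained - lost)))

X = 0.01                 # mag, assumed; the two subsets differ by X
dM = [2 * X, X]          # subsets 1 and 2
A = [1 / 4, 1 / 8]       # drift amplitudes of subsets 1 and 2

swap = overlap_bias(dM, A, [[0, 1], [1, 0]])      # complete exchange
transfer = overlap_bias(dM, A, [[0, 1], [0, 0]])  # one-sided loss from 1 to 2
print(swap, transfer)
```

Under these assumptions the swap gives $X(A_2 - A_1)$ and the transfer gives $-X A_1$, matching the values quoted in the text.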

These bias effects from recognized, but overlapping and 
confused, subsets will add to the bias due to the un- 
recognized subsets. The overlap contribution is small, 
however, because one is not mistaking the absolute mag- 
nitude by the full deviation from the sample mean but 
only by the amount to the nearest subset's magnitude; 
furthermore, the biases from each subset are not additive 
but are differenced during an exchange. Because the confusion 
is due to intrinsic luminosity function width, 
the observational resolution does not play a major role 
as in the previous section. Indeed, with fine resolution 
one might be tempted to subdivide the sample into more 
subsets, which could lead to more overlaps, but such multiplicity 
of subsets further reduces the fractions $A_i$ and 
so the overlap biases are even smaller. 



C. Continuous Subclasses 

Many supernova properties are not discrete, but con- 
tinuous, and the subset categorizations may not be well 
separated, as discussed in the last two subsections. The 
limit of fuzziness in supernova properties is a continuous 
subclass distribution. Here the sample is a cloud in some 
multidimensional observational data space and the abso- 
lute magnitude is a function of the location in that space. 
We assume this is deterministic, so improved knowledge 
of the properties leads to a tighter distribution for the 
absolute magnitude. This means that bias due to unrec- 
ognized subsets is replaced by bias due to unpinpointed 
properties. In the completely unlocalized case (where ob- 
servations are too weak to determine the location within 
the cloud, e.g. missing some type of observation can make 
the uncertainty in some dimension span the entire range) , 
this is equivalent to the case of no recognized subsets, i.e. 
a single Hubble diagram. 

We can adapt the formalism of the discrete subsets by taking the sum
over subsets to an integral over continuous variables. If $\vec{\eta}$
represents the multidimensional parameter set over properties
$x_1, \ldots, x_N$ (e.g. metallicity, velocity decline rate, silicon
line ratio, etc.), then

$$\Delta m(z) = \int d\vec{\eta}\; \Delta M(\vec{\eta})\,[f(\vec{\eta}, z) - f(\vec{\eta}, 0)] . \tag{15}$$

For simplicity, we first consider a one dimensional space over a
continuous property parametrized by $x$. Both $\Delta M$ and $f$ will
be functions of $x$. Taking

$$f(x, z) - f(x, 0) = \Delta F(x)\,(z/1.7)^B , \tag{16}$$

the mean value of the parameter $x$ drifts from
$x_0 = \int dx\, x\, f(x,0) / \int dx\, f(x,0)$ to

$$\langle x\rangle(z) = x_0 + \left(\frac{z}{1.7}\right)^{B} \int dx\, x\, \Delta F(x) . \tag{17}$$

This drift causes a change in the mean absolute magnitude of the full
sample (recall there are no individual subsets in this approach),
$\Delta M(\langle x\rangle(z))$.

To evaluate Eq. further, we must adopt forms for 
$\Delta F(x)$ and $\Delta M(x)$. Suppose

$$\Delta F(x) = A_F\,(x^n - x_F^n) \tag{18}$$
$$\Delta M(x) = A_M\,(x^p - x_M^p) , \tag{19}$$

so the values $x_F$, $x_M$ define the standards. For example, if $x$
represents metallicity, then a supernova with $x = x_M$ defines the
baseline in absolute magnitude, and supernovae with $x = x_F$ maintain
a constant population fraction, i.e. do not drift (the value $x_F$
does not have to actually be realized in the sample). As the value of
$x$ deviates from the standards, the demographics change according to
Eq. (18), with lower metallicity supernovae becoming more common at
high redshift, say, and their luminosity changes according to
Eq. (19), with lower metallicity supernovae being brighter, say.

The completeness conditions are
$\int dx\,\Delta F(x) = \int dx\,\Delta M(x) = 0$, i.e. every
supernova lies somewhere in the parameter space and there is a mean
absolute magnitude for the sample. Then over some finite range
$x \in [x_-, x_+]$,

$$x_F = \left[\frac{1}{n+1}\,\frac{x_+^{n+1} - x_-^{n+1}}{x_+ - x_-}\right]^{1/n} , \tag{20}$$

and the equivalent for $x_M$ with $p$ substituting for $n$.

The magnitude offset generating bias in the cosmological parameter
estimation then takes the form for the continuous case

$$\Delta m(z) = A_F A_M\,(z/1.7)^B \int dx\,(x^n - x_F^n)(x^p - x_M^p) . \tag{21}$$

For $n = p = 1$ this gives

$$\Delta m(z) = (1/12)\,A_F A_M\,(x_+ - x_-)^3\,(z/1.7)^B . \tag{22}$$
We recognize the maximum drift for the sample is $\Delta F_{\max}$,
with a similar expression for $\Delta M_{\max}$, so
$\Delta m(z) = (1/12)\,\Delta F_{\max}\,\Delta M_{\max}\,(z/1.7)^B$.
An analogous expression holds for other values of $n$, $p$.
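
As a rough numerical check of Eqs. (20) to (22), the sketch below evaluates the standards and the offset integral for the power-law forms of Eqs. (18) and (19). The function names, the prefactor arguments `a_f` and `a_m`, and the sample numbers are illustrative assumptions, not values from the analysis. Gauss-Legendre quadrature is used because it is exact for the polynomial integrands that arise for integer $n$, $p$.

```python
import numpy as np

def standard(n, x_lo, x_hi):
    """Value x_F (or x_M) making the integral of (x**n - x_F**n) vanish on [x_lo, x_hi].

    Assumes 0 < x_lo < x_hi so that fractional powers are well defined.
    """
    mean_xn = (x_hi**(n + 1) - x_lo**(n + 1)) / ((n + 1) * (x_hi - x_lo))
    return mean_xn ** (1.0 / n)

def offset_integral(n, p, x_lo, x_hi, order=64):
    """Integral of (x^n - x_F^n)(x^p - x_M^p) over [x_lo, x_hi]."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    half = 0.5 * (x_hi - x_lo)
    x = half * nodes + 0.5 * (x_hi + x_lo)  # map [-1, 1] onto [x_lo, x_hi]
    x_f = standard(n, x_lo, x_hi)
    x_m = standard(p, x_lo, x_hi)
    return half * np.sum(weights * (x**n - x_f**n) * (x**p - x_m**p))

def delta_m(z, a_f, a_m, n, p, x_lo, x_hi, B=1.0):
    """Magnitude offset for prefactors a_f, a_m and redshift power index B."""
    return a_f * a_m * (z / 1.7)**B * offset_integral(n, p, x_lo, x_hi)

# Hypothetical parameter range, chosen only for illustration.
x_lo, x_hi = 0.5, 2.0
print(offset_integral(1, 1, x_lo, x_hi), (x_hi - x_lo)**3 / 12.0)  # n = p = 1 case
print(delta_m(1.7, a_f=0.1, a_m=0.2, n=2, p=1, x_lo=x_lo, x_hi=x_hi))
```

The first printed pair compares the quadrature with the closed form for $n = p = 1$; other exponents can be explored by changing `n` and `p`.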

To generalize to a multidimensional parameter space
over $x_1, \ldots, x_N$, we have

Am(z) = (z/1.7)^F-i^A;.,,Am,. X 



da;,(x--x-)(xf (23) 



where $V_N = \int dx_1 \ldots dx_N$ is the parameter space volume
and we have assumed the parameters are independent
of each other.

So far we have considered that we can measure the observational
parameters $x$ with perfect accuracy and with this possibly determine
the demographics $f(x)$. Of course if knowing $x$ allows us to predict
$\Delta M(x)$ as well, then we can compute $\Delta m(z)$ and there
will be no cosmology bias. But now we consider the case where the
measurements are not perfect but have some uncertainty $\delta x$,
which will propagate through the demographics and absolute magnitude
into the magnitude offset. This is similar to the
"fuzzy" philosophy of WBl 

The measurement imprecision will lead to a magnitude uncertainty

$$\delta m(z) = \int dx\, \delta x(x) \left\{ \frac{d\Delta M(x)}{dx}\,[f(x,z) - f(x,0)] + \Delta M(x) \left[ \frac{df(x,z)}{dx} - \frac{df(x,0)}{dx} \right] \right\} . \tag{24}$$

This can be viewed as taking place in a multidimensional parameter
space of $x$ as well. If there is no uncertainty in some $x_i$, or if
$f$ and $M$ are independent of $x_i$, then this dimension does not
contribute to the magnitude systematic. For simplicity we will write
the expressions in terms of a single continuous parameter $x$.
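
The propagation of a measurement uncertainty can be sketched numerically. The code below reads Eq. (24) as the first-order change of the integrand $\Delta M(x)\,[f(x,z) - f(x,0)]$ under a shift $x \to x + \delta x$, which is an assumption about the intended form, and adopts power-law forms for $\Delta F$ and $\Delta M$ with hypothetical prefactors `a_f`, `a_m`. For constant $\delta x$ and $n = p = 1$ the integrand is a total derivative that vanishes at the boundaries, so the result should be zero.

```python
import numpy as np

def standard(n, x_lo, x_hi):
    """Value x_F (or x_M) making the integral of (x**n - x_F**n) vanish on [x_lo, x_hi]."""
    mean_xn = (x_hi**(n + 1) - x_lo**(n + 1)) / ((n + 1) * (x_hi - x_lo))
    return mean_xn ** (1.0 / n)

def magnitude_systematic(z, dx_func, a_f, a_m, n, p, x_lo, x_hi, B=1.0, order=64):
    """Integrate dx(x) * d/dx[DeltaM(x) * DeltaF(x)] and scale by (z/1.7)**B."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    half = 0.5 * (x_hi - x_lo)
    x = half * nodes + 0.5 * (x_hi + x_lo)  # map [-1, 1] onto [x_lo, x_hi]
    x_f = standard(n, x_lo, x_hi)
    x_m = standard(p, x_lo, x_hi)
    d_f = a_f * (x**n - x_f**n)       # population drift amplitude
    d_m = a_m * (x**p - x_m**p)       # absolute magnitude deviation
    d_f_dx = a_f * n * x**(n - 1)
    d_m_dx = a_m * p * x**(p - 1)
    integrand = dx_func(x) * (d_m_dx * d_f + d_m * d_f_dx)
    return (z / 1.7)**B * half * np.sum(weights * integrand)

constant_dx = lambda x: 0.1  # hypothetical constant measurement uncertainty
print(magnitude_systematic(1.7, constant_dx, 1.0, 1.0, 1, 1, 0.5, 2.0))  # linear forms
print(magnitude_systematic(1.7, constant_dx, 1.0, 1.0, 2, 1, 0.5, 2.0))  # n = 2, p = 1
```

A position-dependent uncertainty can be explored by passing a different `dx_func`.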

Using the forms of Eqs. (16), (18), (19), we can evaluate the
magnitude systematic in Eq. (24) given some observational input for
the uncertainty $\delta x(x)$. As the simplest case, we take
$\delta x$ constant. For $n = p = 1$ the completeness conditions
ensure that the systematic is zero. The general result is of the form

$$\delta m(z) \approx \Delta M_{\max}\,\frac{\delta x}{\Delta x}\left(\frac{z}{1.7}\right)^{B} . \tag{25}$$

The range $\Delta x$ can be defined through either theoretical model
limits on the variation of $x$ (e.g. metallicity) or as some weighted
range that captures 90%, say, of the magnitude drift.

Thus, perfect measurements give no uncertainty in this situation where
the functional dependences are assumed known, but as the observations
become more imprecise, i.e. $\delta x$ increases, the magnitude
uncertainty grows. Eventually the perturbative formalism used here
breaks down, but when $\delta x$ becomes comparable to $\Delta x$ then
this approach should reduce to the single Hubble diagram case.

One can remove the bias due to the lack of observational resolution by
fitting for the form of the magnitude offset, e.g. Eq. (25). Two fit
parameters are the evolution power index $B$ and the prefactor, call
it $C$. If no prior is placed on these quantities, then the
degradation on dark energy parameters is severe, reducing the figure
of merit to less than 2. In particular, $B$ is poorly determined and
covariant with the dark energy variables, so that even an
overidealized prior of 0.002 on $C$ gives a figure of merit of only
62. We therefore fix $B = 1$ and investigate the degradation as a
function of the prior on $C$, essentially equivalent to observational
resolution $\delta x/\Delta x$. Figure 5 shows the results as a
function of this resolution.

[Figure 5: plot not reproduced. Horizontal axis: resolution $\delta x/\Delta x$. Panel labels: $\delta m(z) = 0.05\,(\delta x/\Delta x)(z/1.7)$; SN+CMB.]

FIG. 5: For continuous parameters defining supernova subclasses, lack
of observational resolution degrades the dark energy figure of merit.
When the resolution $\delta x/\Delta x = 0$, then the observations
exactly determine the supernovae properties. When
$\delta x/\Delta x = 1$ then the observations are blurred over the
entire sample, making this equivalent to using only a single Hubble
diagram, but with an added fit parameter for the drift amplitude,
reducing the figure of merit by a factor of ~3.



The figure of merit is rapidly degraded as the observational acuity
decreases. The effective total magnitude offset here is
$\delta m(z = 1.7) = 0.05\,(\delta x/\Delta x)$, so a resolution of
0.4 corresponds to 0.02 mag evolution. To defend against degradation
of more than 20% in the figure of merit requires a resolution of 0.27.
Of course as more fit parameters are added, the requirements will
tighten. Thus, lack of observational resolution leads directly to
unpinpointed or confused subclasses and loss of cosmological
information.
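
The conversion between resolution and effective magnitude offset quoted above is simple arithmetic; a small helper makes it explicit. The function name and argument names are illustrative, and the amplitude 0.05 mag and linear redshift scaling ($B = 1$) are taken from the case described in the text.

```python
def offset_from_resolution(resolution, z=1.7, amplitude=0.05, B=1.0):
    """Effective magnitude offset (mag) for a fractional resolution dx/Delta x."""
    return amplitude * resolution * (z / 1.7)**B

print(offset_from_resolution(0.4))   # resolution quoted as matching 0.02 mag evolution
print(offset_from_resolution(0.27))  # resolution quoted for a 20% figure-of-merit loss
```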



D. Defining Subclasses 

A central issue mentioned in S|lT] is the consequence 
when a subclass fails, i.e. when a carefully characterized subset does
not have an unevolving mean luminosity. (Recall we don't require the
luminosity distribution to be independent of redshift, only that the
mean stays constant.) If the subset is not a true subclass then we can
absorb the residual luminosity evolution into an effective population
drift $f_i(z)$ via



n{z) L,{z) ^ Mz) 



L,{0) ^ LM ■ (26) 



So a drift in $L_i(z)$ because the subset $i$ is not a true subclass
can be viewed as an uncertainty in $f_i(z)$. We can then try to
account for this by fitting for $f_i(z)$. (Note that now the quantity
$f_i(z)$ is not directly observable.) As a fitting function we
consider an expansion in Chebyshev polynomials over the range $z = 0$
to $1.7$. We include terms through second order so as to allow the
possibility for non-monotonic behavior, with

$$f_i(z) = \sum_{j=0}^{2} a_j\,T_j(x = z/1.7) , \tag{27}$$



[Figure 6: plot not reproduced. Panel labels: 2 subsets, $f(z) = \sum_{j=0}^{2} a_j T_j(z)$; SN+CMB. Curve legends: fix $a_2$ with prior $\sigma(a_1)$; prior $\sigma(a_1) = \sigma(a_2)$.]



FIG. 6: Unrecognized magnitude evolution can be accounted for by
fitting for an effective population drift $f(z)$. Here the evolution
is expanded to second order in Chebyshev polynomials and the fiducial
parameters correspond to $\Delta m(z = 1.7) = 0.02$ mag. We show the
dark energy figure of merit as a function of the priors placed on the
subset evolution expansion coefficients. The solid, black curve
corresponds to using only a first order Chebyshev expansion while the
dashed, blue curve uses a second order expansion. In the first case
the prior refers to that on $a_1$ while in the second case we take the
priors on $a_1$ and $a_2$ to be equal. When no priors are used, the
value of the figure of merit is shown by the x's at the right axis.
The magnitude of the priors can be related to an additional evolution
by $\sigma_{m,\max} = 0.02\,[\sigma(a)/0.5]$ for either coefficient
$a_j$.



where we normalize the polynomials to the interval [0, 1]. 

Adding such freedom degrades the figure of merit unless tight priors
are placed on the amplitude of magnitude evolution allowed within the
subset. Figure 6 explores the effect of fitting for a residual
magnitude evolution of amplitude $\Delta m(z = 1.7) = 0.02$,
considering two subsets that fiducially linearly evolve from equal
fractions at $z = 0$ to 100% in one subset at $z = 1.7$ (i.e. the
fiducial case is $a_0 = 0.5$, $a_1 = 0.5$, $a_2 = 0$). We have further
simplified the situation by taking $f_2(z) = 1 - f_1(z)$, which will
not be true in general since the $f_i$ no longer represent physical
fractions of the sample. If this were relaxed or more subsets were
used, the number of fit parameters would increase and the degradation
would worsen.
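
The fiducial drift can be reproduced with a short sketch. It assumes that normalizing to $[0, 1]$ means evaluating the standard first-kind Chebyshev polynomials at $x = z/1.7$; this reading is an assumption, chosen because it makes the quoted fiducial coefficients give equal fractions at $z = 0$ and 100% at $z = 1.7$. The function name `effective_fraction` is illustrative.

```python
import numpy as np
from numpy.polynomial import chebyshev

Z_MAX = 1.7  # upper end of the redshift range considered

def effective_fraction(z, coeffs):
    """Effective population fraction: sum_j a_j T_j(x) with x = z / Z_MAX."""
    x = np.asarray(z, dtype=float) / Z_MAX
    return chebyshev.chebval(x, coeffs)

fiducial = [0.5, 0.5, 0.0]           # a_0, a_1, a_2
z_grid = [0.0, 0.85, 1.7]
f1 = effective_fraction(z_grid, fiducial)
f2 = 1.0 - f1                        # simplification adopted for two subsets
print(f1, f2)

# A nonzero second-order coefficient (hypothetical value) permits curvature,
# including non-monotonic behavior for large enough a_2.
print(effective_fraction(z_grid, [0.5, 0.5, 0.1]))
```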

Without priors on either Chebyshev coefficient, $a_1$ or $a_2$, the
figure of merit plunges by a factor 100 to a value of 2. Even freely
fitting one coefficient lowers the figure of merit to 30. Priors on
each $a$ of 0.5, corresponding to a maximum evolution uncertainty of
0.02 mag from each, degrade the figure of merit by a factor 2.2 (1.5
if only allowing linear evolution).



The size of the effect due to residual uncertainty in 
whether the subset is truly a subclass points up the im- 
portance of having a comprehensive suite of precision 
observations. Exactly what these should be is not yet 
known. The supernova spectrum should contain the re- 
quired information (see for example [H, [H, [H, [13, HI] ) ; 
broad band photometry may not be sufficient. Recall that
approximately 60% of the bolometric flux is emitted in the rest frame
BVR bands, so relying only upon rest frame ultraviolet or near
infrared measurements leaves open the possibility that the tail does
not move in the same way the dog does. Similarly, another area of
active research involves the use of particular spectral features
[29, 30, 31, 32]. While any of these may prove robust, global analysis
of the supernova spectrum appears less subject to such uncertainties.
One way to implement this could be through principal component
analysis (PCA), for example (see [33, 34] for early steps).

PCA could effectively tell us whether the defining subclass variables
involve, e.g., line ratios, velocities, velocity changes, etc. While
the amount of degradation was significant when adding only two extra
fitting parameters in the Chebyshev polynomial case, PCA by its nature
focuses on the most relevant combinations of variations, and so may
prove a tractable analysis approach in combination with spectral
observations. Indeed, preliminary indications point to the first two
PCs accounting for 85% of the spectral variation [34].



VI. CONCLUSIONS

Without any systematics, Type Ia supernovae would be statistically the
most powerful tool for probing the accelerating expansion of the
universe. One of the key approaches for controlling systematics is
that of likes vs. likes, or supernova demographics, carefully
comparing sample properties through a suite of observational
characterizations. The simple concept is that like supernovae at
different redshifts accurately reveal the cosmology, while supernovae
at the same redshift, differing in essential ways, can define subsets
giving clues to reining in systematics.

The issue is not one of evolution, but uncorrected evolution and
unrecognized evolution. This article examines techniques for
evaluating the cosmological consequences of systematic control or the
lack of it, and strategies for implementing such control. The main
pitfall is bias of the cosmological parameters; this is a bad thing,
not just because it degrades the effective dark energy figure of merit
calculated in terms of the risk, but because physics is lost. One may
end up with an impressively precise but simply inaccurate conclusion.

Having high observational acuity and using all this information to
define robust subsets is the optimal strategy. We quantify this and
demonstrate that this holds even at the price of additional subset
parameters in the fit. Analyzing the data in a single Hubble diagram
can lead to biases of order a full statistical sigma and (secondarily)
loss of figure of merit by a factor 1.6. To avoid these consequences,
one uses the recognized subsets to add fit parameters; this restores
essentially all the cosmological leverage, as long as the subsets are
sufficiently well defined by the observations that these subset mean
luminosities do not evolve.

The observational requirements to define the subsets are a complex
subject but we consider three categories, of separated, overlapping,
and continuous, or unrecognized, confused, and unpinpointed, subsets,
and quantify some requirements within simplistic models. We further
briefly consider subsets whose luminosities do in fact evolve and
speculate that principal component analysis applied to supernova
spectra may prove the best path to robust control. In the appendix we
illustrate how combining separate data sets, especially from different
redshift ranges, can act similarly to evolution and have significant
deleterious effects.

A supernova survey designed without the controls enabled by high
observational acuity is taking a gamble on the astrophysics and
supernova properties being kind. While we do not yet know exactly what
subsets to define, the capability and flexibility to do so, as
measured quantitatively along the lines of the simple calculations
here, are required to ensure confidence in the cosmological results.



Acknowledgments 

I thank Bob Cahn, Ariel Goobar, Dragan Huterer, 
Alex Kim, Peter Nugent, Reynald Pain, and Saul Perl- 
mutter for useful discussions. This work has been sup- 
ported in part by the Director, Office of Science, Office of 
High Energy Physics, of the U.S. Department of Energy 
under Contract No. DE-AC02-05CH11231. 



Appendix: Redshift Distribution Effects 

The redshift dependence of the population drift, or more generally the
subset distribution, convolves with the Fisher sensitivity derivatives
$\partial m/\partial p_j$ in Eq. (7) in a complicated manner to lead
to parameter bias. One cannot in general predict analytically how a
given form for $f_k(z)$ leads to a bias. The offset $\Delta m(z)$
beats against $\partial m/\partial p_j$ but the set of
$\partial m/\partial p_j$ do not form a complete basis, nor even an
orthogonal one. Even if $\Delta m(z)$ had exactly the same functional
form as some $\partial m/\partial p_j$, the offset propagates not just
to the parameter $p_j$ but to all the parameters (unless the inverse
Fisher matrix in Eq. (7) is formed purely from SN magnitude
information, without CMB information or priors). The one exception is
a redshift independent $\Delta m(z)$, which induces a pure shift in
$\mathcal{M}$ since this parameter enters only into SN magnitudes.
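
The way an offset propagates into parameter bias can be illustrated with a toy calculation. The sketch below assumes the standard Fisher bias expression, $\delta p = F^{-1} \sum_k \sigma_k^{-2}\,\Delta m(z_k)\,\partial m(z_k)/\partial p$, with the prior information added to $F$. The sensitivity derivatives, redshift grid, errors, and prior values are hypothetical placeholders, not the cosmological derivatives used in the analysis.

```python
import numpy as np

def parameter_bias(derivs, sigma, delta_m, prior_fisher=None):
    """Bias on each parameter from an unrecognized magnitude offset delta_m.

    derivs: (n_sn, n_par) sensitivities dm/dp_j at each supernova redshift.
    sigma: (n_sn,) magnitude uncertainties. delta_m: (n_sn,) offsets.
    prior_fisher: optional (n_par, n_par) external information (e.g. CMB-like).
    """
    weights = 1.0 / np.asarray(sigma)**2
    fisher = derivs.T @ (weights[:, None] * derivs)
    if prior_fisher is not None:
        fisher = fisher + prior_fisher
    rhs = derivs.T @ (weights * delta_m)
    return np.linalg.solve(fisher, rhs)

z = np.linspace(0.05, 1.7, 34)
# Toy derivatives: the first column is the magnitude zeropoint parameter
# (dm/dp = 1); the other two are stand-ins for cosmological parameters.
derivs = np.column_stack([np.ones_like(z), np.log1p(z), z / (1.0 + z)])
sigma = np.full_like(z, 0.15)
prior = np.diag([0.0, 50.0, 20.0])   # external prior does not touch the zeropoint

bias_const = parameter_bias(derivs, sigma, np.full_like(z, 0.02), prior)
bias_drift = parameter_bias(derivs, sigma, 0.02 * z / 1.7, prior)
print(bias_const)   # a constant offset is absorbed by the zeropoint parameter
print(bias_drift)   # a redshift dependent offset spreads over the parameters
```

In this toy setup the constant offset shifts only the zeropoint parameter because the external prior carries no information on it, while the linear drift leaks into the other parameters.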

Thus we must calculate the effect of various forms of $f(z)$
numerically, and it is important to consider a variety of behaviors as
we do. In general, we find that nearly linear redshift evolution,
$B \approx 1$, has the greatest impact on the risk figure of merit.

However, we could consider another type of magnitude offset, not
intrinsic to the SN populations, but rather arising from the
measurement process. If different surveys are combined, a
miscalibration between the magnitudes can ensue, even if the SN
absolute magnitudes are equal, due to filter or instrumental zeropoint
offsets. The redshift dependence of the samples, taking the place of
population drift $f(z)$, can be particularly sharp, for example when
combining a lower redshift ground-based sample with a high redshift
space-based sample. As one example, if these sets are matched at
$z = 0.8$ with an unrecognized miscalibration of 0.02 mag, then the
cosmological parameter bias causes the risk figure of merit to be
degraded by a factor 2.7 (with parameters biased by up to
$1.0\sigma$). See Fig. 17 of [35] for other matching scenarios.
Overlap between the sets needs to be substantial to ameliorate the
degradation.

In general one would want to define new fit parameters for possible
offsets when using multiple samples, to eliminate bias, but these
additional parameters tend to increase the dispersion substantially.
For example, with a single offset fit parameter the area figure of
merit degrades by a factor 2.4 without a prior on the offset; the
factor is still 1.6 with a prior of 0.02 mag. So to add to the other
strategies for controlling systematics, a homogeneous sample over the
full redshift range, or substantial overlap between sets, strongly
improves the cosmological accuracy.